General notes on project: - max of 15 pages
Our dataset comes from the most recent version of the Canadian Nutrient File (CNF), updated in 2015. The database contains average nutrient values for foods available in Canada. These averages describe generic versions of a food unless a specific brand is included in the database. The dataset is bilingual: food names, descriptions, and background information are provided in both French and English. A major goal of this version was to update nutrient values for the foods that contribute the most sodium to the diet, since manufacturers have been working to reduce the sodium content of foods.
The database covers 5690 unique foods, ranging from Cheese souffle to Vanilla extract, and provides the average nutrient levels per 100 grams.
Notably, some of the nutrients included are subcomponents of other nutrients. Others, like the two metrics for food energy (kcal and kJ), are simply different ways of measuring the same quantity (1 kcal = 4.184 kJ).
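The relationship between the two energy columns can be sketched in a few lines of R. The function name `kcal_to_kj` and the example values are assumptions made for illustration; only the conversion factor comes from the text above.

```r
# Conversion factor stated in the text: 1 kcal = 4.184 kJ
kcal_to_kj <- function(kcal) kcal * 4.184

# Hypothetical per-100 g energy values in kcal (illustration only)
energy_kcal <- c(0, 100, 250)
energy_kj <- kcal_to_kj(energy_kcal)

# The two measures are perfectly correlated, so they carry the same information
energy_cor <- cor(energy_kcal, energy_kj)
print(energy_kj)
print(energy_cor)
```

Because one column is a fixed multiple of the other, keeping both in a PCA or regression would count the same information twice.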
It is also worth noting that many values are missing. For instance, only about 1.8% of the rows have a value for biotin; the other 98.2% are missing (see Missing values in clean dataset section).
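A minimal sketch of how such a missingness share can be computed. The data frame `toy` and its values are made up for illustration and are not the project's data.

```r
# Toy data frame standing in for the cleaned dataset (values are invented)
toy <- data.frame(
  PROTEIN = c(11.1, 7.6, 0.5, 20.3),
  BIOTIN  = c(NA, 3.5, NA, NA)
)

# Percentage of rows with a missing value, per column
pct_missing <- colMeans(is.na(toy)) * 100

# Percentage of rows that do have a value, per column
pct_present <- 100 - pct_missing
print(pct_present)
```

Reporting both numbers side by side helps avoid confusing the share of missing rows with the share of rows that have a value.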
The original dataset was relational, with unique identifiers for the main variables of interest. We therefore merged the relational tables on these identifiers (e.g., FoodID) using the left_join function. Then, we
The original dataset was cleaned for analyses in the following ways:
(see here for a full summary of the variables in the cleaned dataset). All of the subsequent summary statistics and analyses mentioned in the paper will be using the clean dataset.
Description of file contents:
- From FOOD NAME.csv: FoodID (merging var), FoodGroupID (merging var), FoodDescription
- From NUTRIENT NAME.csv: NutrientID (merging var), NutrientName, NutrientUnit (possibly as control??)
- From NUTRIENT AMOUNT.csv: NutrientValue, FoodID (merging var), NutrientID (merging var)
- From FOOD GROUP.csv: FoodGroupID (merging var), FoodGroupName
Main file of interest is NUTRIENT AMOUNT.csv, which contains the long version of the dataset: there are multiple rows for each food, along with a column for the nutrient identifier and a column for the nutrient value associated with that identifier
Will have to combine the multiple datasets into one to have all info in one place
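The combining step could look roughly like the sketch below. It uses base R `merge(..., all.x = TRUE)` as a stand-in for dplyr's `left_join`, and the four small data frames are invented stand-ins for the CSV files: only the column names come from the notes above, while the IDs, descriptions, and values are hypothetical.

```r
# Toy stand-ins for the four CSV files (all values are invented)
food_name <- data.frame(FoodID = c(1, 2), FoodGroupID = c(10, 20),
                        FoodDescription = c("Toy food A", "Toy food B"))
food_group <- data.frame(FoodGroupID = c(10, 20),
                         FoodGroupName = c("Toy group X", "Toy group Y"))
nutrient_name <- data.frame(NutrientID = c(1, 2),
                            NutrientName = c("PROTEIN", "FAT (TOTAL LIPIDS)"),
                            NutrientUnit = c("g", "g"))
nutrient_amount <- data.frame(FoodID = c(1, 1, 2),
                              NutrientID = c(1, 2, 1),
                              NutrientValue = c(9.5, 17.0, 0.1))

# Left joins: keep every row of the long table and attach the lookup columns
long <- merge(nutrient_amount, nutrient_name, by = "NutrientID", all.x = TRUE)
long <- merge(long, food_name, by = "FoodID", all.x = TRUE)
long <- merge(long, food_group, by = "FoodGroupID", all.x = TRUE)

# Long -> wide: one row per food, one column per nutrient.
# Each food-nutrient pair is assumed to appear once; absent pairs become NA.
wide <- tapply(long$NutrientValue,
               list(long$FoodID, long$NutrientName),
               function(v) v[1])
print(wide)
```

In the wide matrix, a food with no recorded value for a nutrient shows up as NA, which is where the missing values discussed in these notes come from.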
“At present foods are grouped under 23 different group headings based on similar characteristics of the foods”
The food names are only available in this version in one length which does not include abbreviations and can be up to 255 characters long
All of the nutrient data is stored per 100g of the food (edible portion) - that is, all nutritional data is on the same scale
in cleaning - only selected variables of interest (e.g., we removed mean SE for the nutrient values, since there were many missing values)
result report: wordcloud of each cluster based on most frequent names in food names
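A word cloud is drawn from word frequencies, so the counting step can be sketched in base R as below. The data frame `toy`, the helper `top_words`, and the cluster labels are hypothetical. The tokenizer is a deliberate simplification: it keeps only unaccented letters a to z.

```r
# Toy cluster assignments and food descriptions (invented for illustration)
toy <- data.frame(
  cluster = c(1, 1, 2),
  FoodDescription = c("Cheese souffle", "Cheese, cheddar", "Vanilla extract")
)

# Count the most frequent words in a set of descriptions
top_words <- function(descriptions, n = 3) {
  words <- unlist(strsplit(tolower(descriptions), "[^a-z]+"))
  words <- words[nchar(words) > 0]          # drop empty tokens
  head(sort(table(words), decreasing = TRUE), n)
}

# One frequency table per cluster
by_cluster <- lapply(split(toy$FoodDescription, toy$cluster), top_words)
print(by_cluster)
```

Each element of `by_cluster` holds the words and counts for one cluster, which is the kind of input a word cloud plot takes.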
Further idea: for better interpretation of the data, theoretically narrow down the nutrients of interest (e.g., Energy, Carb, Protein, Fat, Sugar, Vitamin, Cholesterol) and run the same analyses?
Size of each cluster: 9, 21, 9, 23, 97, 1785, 3687, 59
###### Plotting : wordcloud
###### Mean calories per cluster
##### Most important nutrients used for spectral clustering (PC1, PC2)
###### Run k-means clustering
###### Mean calories per cluster
# Methods Parking Lot

We will run spectral clustering to see which foods group together based on their nutrient profiles. The goal is to find out whether foods form clear-cut clusters by nutrient profile, regardless of their assigned food group. The resulting clusters can be used to list foods that offer the best combination of nutrients for a person’s dietary needs. This improves on diets that rely on the official food groups, because our clusters should represent the nutrient profile of a set of foods better than the food group label, which may be assigned somewhat arbitrarily. Since the data are high-dimensional (153 nutrients for 5690 unique foods), we are going to run principal component analysis (PCA) to
How to deal with NAs? We want to run PCA and find the ideal number of PCs to use for spectral clustering, but prcomp cannot run with NAs. If we use na.omit, not a single food survives. Options: recode all NAs to 0 for preliminary analyses, or use mean imputation.
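The three options mentioned above can be compared on a small made-up data frame. The names `toy` and `impute_mean` are hypothetical and the values are invented.

```r
# Toy data with scattered missing values (invented for illustration)
toy <- data.frame(a = c(1, NA, 3), b = c(NA, 4, 6))

# Option 1: na.omit drops every row that has at least one NA
n_complete <- nrow(na.omit(toy))

# Option 2: recode all NAs to 0
toy_zero <- toy
toy_zero[is.na(toy_zero)] <- 0

# Option 3: mean imputation, column by column
impute_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}
toy_mean <- as.data.frame(lapply(toy, impute_mean))

print(n_complete)
print(toy_zero)
print(toy_mean)
```

Recoding to 0 treats a missing measurement as the nutrient being absent, while mean imputation treats the food as average for that nutrient. Both are assumptions, and either can influence the principal components.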
prcomp: scale and center each column? prcomp.default(nutrient_only, scale = T, center = T) fails with the error "cannot rescale a constant/zero column to unit variance"
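One possible way around this error is to drop zero-variance columns before calling `prcomp`. The sketch below assumes the NAs have already been handled. The data frame is simulated, so only the name `nutrient_only` comes from the notes.

```r
set.seed(1)

# Simulated stand-in for nutrient_only, with one constant column
nutrient_only <- data.frame(x = rnorm(20), y = rnorm(20), const = 0)

# Keep only columns whose variance is strictly positive
keep <- vapply(nutrient_only, function(col) var(col) > 0, logical(1))

# Centered and scaled PCA on the remaining columns
pca <- prcomp(nutrient_only[, keep, drop = FALSE], center = TRUE, scale. = TRUE)

# Share of variance explained by each principal component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(names(keep)[!keep])
print(var_explained)
```

A constant column cannot help separate foods, so dropping it loses no information for clustering. Printing the dropped names keeps the decision visible.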
remove both? I like the idea of predicting the average calories of each cluster that we assign. Any recommended methods to do this? Or just report the average calories of each cluster?
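The simplest version of this, reporting the mean calories per cluster, can be sketched with `aggregate`. The data frame `toy` and the column names `cluster` and `kcal` are hypothetical.

```r
# Toy cluster assignments and energy values in kcal per 100 g (invented)
toy <- data.frame(
  cluster = c(1, 1, 2, 2, 2),
  kcal    = c(100, 200, 50, 60, 70)
)

# Mean calories within each cluster
mean_kcal <- aggregate(kcal ~ cluster, data = toy, FUN = mean)
print(mean_kcal)
```

A regression of calories on cluster membership would give the same cluster means as fitted values, so a table like this is a reasonable first report.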
use tab_model or stargazer for showing regressions (see previous write ups for examples)
use silhouette method from lecture on clustering to decide on a number of clusters
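A minimal sketch of the silhouette approach, assuming the `cluster` package is installed. The data are simulated with three well-separated groups and stand in for the PC scores. The seed, the range of k, and the object names are illustrative choices.

```r
library(cluster)  # provides silhouette()

set.seed(571)

# Simulated 2-D scores with three well-separated groups
scores <- rbind(
  matrix(rnorm(60, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(60, mean = 5, sd = 0.3), ncol = 2),
  cbind(rnorm(30, mean = 0, sd = 0.3), rnorm(30, mean = 5, sd = 0.3))
)

d <- dist(scores)   # pairwise Euclidean distances
ks <- 2:6           # candidate numbers of clusters

# Average silhouette width for each candidate k
avg_sil <- sapply(ks, function(k) {
  km <- kmeans(scores, centers = k, nstart = 20)
  mean(silhouette(km$cluster, d)[, "sil_width"])
})

# Pick the k with the highest average silhouette width
best_k <- ks[which.max(avg_sil)]
print(data.frame(k = ks, avg_sil = avg_sil))
print(best_k)
```

On real nutrient data the silhouette curve is usually flatter than in this simulation, so it is worth inspecting the whole curve and not only the maximum.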
just write the code to create the models & figures & Keana will incorporate the results into the results/appendix section - no need to write out the stat output from models - just a summary of what they generally show & the logic will be helpful
last time there was writing between the code - which made the formatting look a little weird (i.e., weird spacing between paragraphs where the code was). this time, let’s have all of the code at the end of a section so the formatting doesn’t get messed up
possibly use this package for assessing model fit (if included): https://github.com/easystats/performance
What are the words in food description that predict the cluster they might be in?
From the CNF 2015 users guide (CNF 2015 users_guide EN.pdf):
- The CNF is particularly suited for assessment of diets, recipe development, menu planning when ingredients or menu items are not specific and for population nutrition surveillance activities, where nutrient intake distributions are used to conduct risk assessments such as modeling for fortification proposals. It is also useful in the initial stages of product development to ensure that nutritional targets can be met.
- The exact nutrient composition of a specific apple or cookie is not found on the CNF. These averages, except where indicated otherwise, take into account sources of a given food across Canada. Local foods may have a different profile than the national average.
- Most users are looking for an average or mean value for a generic representation of the foods as described. These generic values have been derived from combining brands of similar products, for example all major brands of ketchup; various varieties of oranges or similar beef cuts from various producers.
- This dataset is only relevant to products available in Canada, so the results cannot be generalized to products from other countries. Future research should explore whether these findings replicate among products in other countries.
- The nutrient values are all standardized per 100 g, so they do not represent how much a person may actually consume from a package. They would need to be converted to the portions people actually eat to be interpretable in terms of intake.
Dimensions: 5690 x 154
Duplicates: 0
| No | Variable | Stats / Values | Freqs (% of Valid) | Graph | Missing |
|---|---|---|---|---|---|
| 1 | FoodGroupName [factor] | 1. Babyfoods; 2. Baked Products; 3. Beef Products; 4. Beverages; 5. Breakfast cereals; 6. Cereals, Grains and Pasta; 7. Dairy and Egg Products; 8. Fast Foods; 9. Fats and Oils; 10. Finfish and Shellfish Pro; [ 13 others ] | 94 (1.7%); 441 (7.8%); 170 (3.0%); 243 (4.3%); 212 (3.7%); 155 (2.7%); 241 (4.2%); 174 (3.1%); 144 (2.5%); 325 (5.7%); 3491 (61.4%) | | 0 (0.0%) |
| 2 | PROTEIN [numeric] | Mean (sd): 11.1 (10.8); min < med < max: 0 < 7.6 < 85.6; IQR (CV): 16.6 (1) | 2261 distinct values | | 0 (0.0%) |
| 3 | FAT (TOTAL LIPIDS) [numeric] | Mean (sd): 10 (16.7); min < med < max: 0 < 3.8 < 100; IQR (CV): 11.6 (1.7) | 1913 distinct values | | 0 (0.0%) |
| 4 | CARBOHYDRATE, TOTAL (BY DIFFERENCE) [numeric] | Mean (sd): 22 (26.5); min < med < max: 0 < 10.3 < 100; IQR (CV): 31.6 (1.2) | 2756 distinct values | | 0 (0.0%) |
| 5 | ASH, TOTAL [numeric] | Mean (sd): 1.9 (3.5); min < med < max: 0 < 1.2 < 99.8; IQR (CV): 1.2 (1.8) | 646 distinct values | | 1 (0.0%) |
| 6 | ENERGY (KILOCALORIES) [numeric] | Mean (sd): 219 (174); min < med < max: 0 < 174 < 902; IQR (CV): 240 (0.8) | 665 distinct values | | 0 (0.0%) |
| 7 | ALCOHOL [numeric] | Mean (sd): 0.1 (1.8); min < med < max: 0 < 0 < 42.5; IQR (CV): 0 (14.7) | 37 distinct values | | 325 (5.7%) |
| 8 | MOISTURE [numeric] | Mean (sd): 55 (31); min < med < max: 0 < 64.7 < 100; IQR (CV): 50.3 (0.6) | 3417 distinct values | | 0 (0.0%) |
| 9 | CAFFEINE [numeric] | Mean (sd): 3.9 (101); min < med < max: 0 < 0 < 5714; IQR (CV): 0 (25.6) | 61 distinct values | | 312 (5.5%) |
| 10 | THEOBROMINE [numeric] | Mean (sd): 7 (77.1); min < med < max: 0 < 0 < 2634; IQR (CV): 0 (11.1) | 130 distinct values | | 338 (5.9%) |
| 11 | ENERGY (KILOJOULES) [numeric] | Mean (sd): 915 (727); min < med < max: 0 < 727 < 3774; IQR (CV): 1006 (0.8) | 1658 distinct values | | 1 (0.0%) |
| 12 | SUGARS, TOTAL [numeric] | Mean (sd): 7.7 (15); min < med < max: 0 < 1.3 < 99.8; IQR (CV): 7.6 (1.9) | 1505 distinct values | | 1046 (18.4%) |
| 13 | FIBRE, TOTAL DIETARY [numeric] | Mean (sd): 2.4 (4.8); min < med < max: 0 < 0.8 < 79; IQR (CV): 2.8 (2) | 237 distinct values | | 224 (3.9%) |
| 14 | CALCIUM [numeric] | Mean (sd): 76.9 (220); min < med < max: 0 < 24 < 7364; IQR (CV): 60 (2.9) | 468 distinct values | | 51 (0.9%) |
| 15 | IRON [numeric] | Mean (sd): 2.6 (5.6); min < med < max: 0 < 1.1 < 124; IQR (CV): 2.1 (2.2) | 882 distinct values | | 51 (0.9%) |
| 16 | MAGNESIUM [numeric] | Mean (sd): 39.7 (64.8); min < med < max: 0 < 21 < 781; IQR (CV): 23 (1.6) | 302 distinct values | | 214 (3.8%) |
| 17 | PHOSPHORUS [numeric] | Mean (sd): 168 (236); min < med < max: 0 < 130 < 9918; IQR (CV): 176 (1.4) | 620 distinct values | | 153 (2.7%) |
| 18 | POTASSIUM [numeric] | Mean (sd): 308 (447); min < med < max: 0 < 232 < 16500; IQR (CV): 215 (1.5) | 895 distinct values | | 165 (2.9%) |
| 19 | SODIUM [numeric] | Mean (sd): 333 (1219); min < med < max: 0 < 82 < 38758; IQR (CV): 339 (3.7) | 1099 distinct values | | 43 (0.8%) |
| 20 | ZINC [numeric] | Mean (sd): 1.6 (3); min < med < max: 0 < 0.8 < 91; IQR (CV): 1.8 (1.9) | 695 distinct values | | 220 (3.9%) |
| 21 | COPPER [numeric] | Mean (sd): 0.2 (0.6); min < med < max: 0 < 0.1 < 15.1; IQR (CV): 0.2 (2.8) | 787 distinct values | | 270 (4.7%) |
| 22 | MANGANESE [numeric] | Mean (sd): 0.6 (3.7); min < med < max: 0 < 0.1 < 133; IQR (CV): 0.4 (6.1) | 1226 distinct values | | 585 (10.3%) |
| 23 | SELENIUM [numeric] | Mean (sd): 14.6 (36.7); min < med < max: 0 < 6.9 < 1917; IQR (CV): 19.4 (2.5) | 614 distinct values | | 722 (12.7%) |
| 24 | RETINOL [numeric] | Mean (sd): 88.8 (840); min < med < max: 0 < 0 < 30000; IQR (CV): 11 (9.5) | 326 distinct values | | 499 (8.8%) |
| 25 | BETA CAROTENE [numeric] | Mean (sd): 292 (1711); min < med < max: 0 < 0 < 42891; IQR (CV): 33 (5.9) | 612 distinct values | | 653 (11.5%) |
| 26 | ALPHA-TOCOPHEROL [numeric] | Mean (sd): 1.2 (4.1); min < med < max: 0 < 0.3 < 149; IQR (CV): 0.6 (3.5) | 447 distinct values | | 1555 (27.3%) |
| 27 | VITAMIN D (INTERNATIONAL UNITS) [numeric] | Mean (sd): 23.9 (241); min < med < max: 0 < 0 < 12716; IQR (CV): 6 (10.1) | 214 distinct values | | 692 (12.2%) |
| 28 | VITAMIN D (D2 + D3) [numeric] | Mean (sd): 0.6 (6.3); min < med < max: 0 < 0 < 318; IQR (CV): 0.2 (10) | 129 distinct values | | 690 (12.1%) |
| 29 | VITAMIN C [numeric] | Mean (sd): 8.2 (53.2); min < med < max: 0 < 0.1 < 1900; IQR (CV): 3.6 (6.5) | 458 distinct values | | 184 (3.2%) |
| 30 | THIAMIN [numeric] | Mean (sd): 0.2 (0.6); min < med < max: 0 < 0.1 < 23.4; IQR (CV): 0.2 (2.6) | 812 distinct values | | 280 (4.9%) |
| 31 | RIBOFLAVIN [numeric] | Mean (sd): 0.2 (0.5); min < med < max: 0 < 0.1 < 17.5; IQR (CV): 0.2 (2.1) | 709 distinct values | | 261 (4.6%) |
| 32 | NIACIN (NICOTINIC ACID) PREFORMED [numeric] | Mean (sd): 3.1 (4.4); min < med < max: 0 < 1.6 < 128; IQR (CV): 4.4 (1.4) | 2828 distinct values | | 234 (4.1%) |
| 33 | TOTAL NIACIN EQUIVALENT [numeric] | Mean (sd): 5.2 (5.6); min < med < max: 0 < 3.5 < 132; IQR (CV): 7.2 (1.1) | 3908 distinct values | | 234 (4.1%) |
| 34 | PANTOTHENIC ACID [numeric] | Mean (sd): 0.6 (0.9); min < med < max: 0 < 0.4 < 21.9; IQR (CV): 0.7 (1.5) | 1316 distinct values | | 936 (16.4%) |
| 35 | VITAMIN B-6 [numeric] | Mean (sd): 0.2 (1); min < med < max: 0 < 0.1 < 68.8; IQR (CV): 0.2 (4.4) | 756 distinct values | | 397 (7.0%) |
| 36 | TOTAL FOLACIN [numeric] | Mean (sd): 37.7 (93.4); min < med < max: 0 < 12 < 3786; IQR (CV): 35 (2.5) | 290 distinct values | | 408 (7.2%) |
| 37 | VITAMIN B-12 [numeric] | Mean (sd): 1.1 (6.8); min < med < max: 0 < 0 < 380; IQR (CV): 0.7 (6.1) | 899 distinct values | | 354 (6.2%) |
| 38 | VITAMIN K [numeric] | Mean (sd): 20.8 (99.9); min < med < max: 0 < 1.7 < 1714; IQR (CV): 6 (4.8) | 434 distinct values | | 2516 (44.2%) |
| 39 | FOLIC ACID [numeric] | Mean (sd): 8.4 (49.1); min < med < max: 0 < 0 < 2993; IQR (CV): 0 (5.8) | 160 distinct values | | 160 (2.8%) |
| 40 | TRYPTOPHAN [numeric] | Mean (sd): 0.1 (0.1); min < med < max: 0 < 0.1 < 1.6; IQR (CV): 0.2 (0.9) | 458 distinct values | | 1835 (32.2%) |
| 41 | THREONINE [numeric] | Mean (sd): 0.5 (0.5); min < med < max: 0 < 0.3 < 3.7; IQR (CV): 0.8 (0.9) | 1300 distinct values | | 1782 (31.3%) |
| 42 | ISOLEUCINE [numeric] | Mean (sd): 0.6 (0.5); min < med < max: 0 < 0.4 < 5; IQR (CV): 0.8 (0.9) | 1369 distinct values | | 1778 (31.2%) |
| 43 | LEUCINE [numeric] | Mean (sd): 1 (0.9); min < med < max: 0 < 0.7 < 7.2; IQR (CV): 1.4 (0.9) | 1834 distinct values | | 1782 (31.3%) |
| 44 | LYSINE [numeric] | Mean (sd): 0.9 (0.9); min < med < max: 0 < 0.4 < 5.8; IQR (CV): 1.6 (1) | 1698 distinct values | | 1764 (31.0%) |
| 45 | METHIONINE [numeric] | Mean (sd): 0.3 (0.3); min < med < max: 0 < 0.2 < 3.2; IQR (CV): 0.5 (1) | 859 distinct values | | 1767 (31.1%) |
| 46 | CYSTINE [numeric] | Mean (sd): 0.2 (0.2); min < med < max: 0 < 0.1 < 2.1; IQR (CV): 0.2 (1) | 495 distinct values | | 1842 (32.4%) |
| 47 | PHENYLALANINE [numeric] | Mean (sd): 0.5 (0.5); min < med < max: 0 < 0.5 < 5.2; IQR (CV): 0.7 (0.9) | 1274 distinct values | | 1782 (31.3%) |
| 48 | TYROSINE [numeric] | Mean (sd): 0.4 (0.4); min < med < max: 0 < 0.3 < 3.3; IQR (CV): 0.6 (0.9) | 1131 distinct values | | 1811 (31.8%) |
| 49 | VALINE [numeric] | Mean (sd): 0.6 (0.6); min < med < max: 0 < 0.4 < 6.2; IQR (CV): 0.9 (0.9) | 1428 distinct values | | 1778 (31.2%) |
| 50 | ARGININE [numeric] | Mean (sd): 0.8 (0.8); min < med < max: 0 < 0.5 < 7.4; IQR (CV): 1.2 (1) | 1626 distinct values | | 1791 (31.5%) |
| 51 | HISTIDINE [numeric] | Mean (sd): 0.4 (0.4); min < med < max: 0 < 0.2 < 2.3; IQR (CV): 0.6 (1) | 1075 distinct values | | 1784 (31.4%) |
| 52 | ALANINE [numeric] | Mean (sd): 0.7 (0.7); min < med < max: 0 < 0.4 < 8; IQR (CV): 1 (1) | 1489 distinct values | | 1836 (32.3%) |
| 53 | ASPARTIC ACID [numeric] | Mean (sd): 1.2 (1.1); min < med < max: 0 < 0.8 < 10.2; IQR (CV): 1.7 (0.9) | 1936 distinct values | | 1850 (32.5%) |
| 54 | GLUTAMIC ACID [numeric] | Mean (sd): 2.4 (12.3); min < med < max: 0 < 1.9 < 757; IQR (CV): 2.9 (5.2) | 2432 distinct values | | 1833 (32.2%) |
| 55 | GLYCINE [numeric] | Mean (sd): 0.6 (0.7); min < med < max: 0 < 0.4 < 19; IQR (CV): 1 (1.1) | 1438 distinct values | | 1835 (32.2%) |
| 56 | PROLINE [numeric] | Mean (sd): 0.7 (0.6); min < med < max: 0 < 0.6 < 12.3; IQR (CV): 0.8 (1) | 1419 distinct values | | 1843 (32.4%) |
| 57 | SERINE [numeric] | Mean (sd): 0.5 (0.5); min < med < max: 0 < 0.5 < 6.1; IQR (CV): 0.7 (0.9) | 1288 distinct values | | 1844 (32.4%) |
| 58 | CHOLESTEROL [numeric] | Mean (sd): 41.5 (138); min < med < max: 0 < 1 < 3100; IQR (CV): 61 (3.3) | 291 distinct values | | 194 (3.4%) |
| 59 | FATTY ACIDS, TRANS, TOTAL [numeric] | Mean (sd): 0.3 (1.7); min < med < max: 0 < 0 < 37.6; IQR (CV): 0.2 (5.9) | 498 distinct values | | 3559 (62.5%) |
| 60 | FATTY ACIDS, SATURATED, TOTAL [numeric] | Mean (sd): 3.1 (5.8); min < med < max: 0 < 1.1 < 95.6; IQR (CV): 3.5 (1.9) | 2812 distinct values | | 238 (4.2%) |
| 61 | FATTY ACIDS, SATURATED, 8:0, OCTANOIC [numeric] | Mean (sd): 0 (0.2); min < med < max: 0 < 0 < 7.5; IQR (CV): 0 (7.1) | 266 distinct values | | 1668 (29.3%) |
| 62 | FATTY ACIDS, SATURATED, 10:0, DECANOIC [numeric] | Mean (sd): 0 (0.2); min < med < max: 0 < 0 < 6; IQR (CV): 0 (5.1) | 350 distinct values | | 1364 (24.0%) |
| 63 | FATTY ACIDS, SATURATED, 12:0, DODECANOIC [numeric] | Mean (sd): 0.2 (1.7); min < med < max: 0 < 0 < 47; IQR (CV): 0 (8.7) | 448 distinct values | | 1201 (21.1%) |
| 64 | FATTY ACIDS, SATURATED, 14:0, TETRADECANOIC [numeric] | Mean (sd): 0.2 (0.9); min < med < max: 0 < 0 < 22.8; IQR (CV): 0.2 (3.5) | 787 distinct values | | 788 (13.8%) |
| 65 | FATTY ACIDS, SATURATED, 16:0, HEXADECANOIC [numeric] | Mean (sd): 1.7 (2.8); min < med < max: 0 < 0.7 < 43.5; IQR (CV): 2.1 (1.7) | 2322 distinct values | | 602 (10.6%) |
| 66 | FATTY ACIDS, SATURATED, 18:0, OCTADECANOIC [numeric] | Mean (sd): 0.8 (1.7); min < med < max: 0 < 0.3 < 33.2; IQR (CV): 0.9 (2) | 1675 distinct values | | 615 (10.8%) |
| 67 | FATTY ACIDS, MONOUNSATURATED, 18:1undifferentiated, OCTADECENOIC [numeric] | Mean (sd): 3.5 (7.2); min < med < max: 0 < 1 < 82.6; IQR (CV): 4 (2) | 2708 distinct values | | 578 (10.2%) |
| 68 | FATTY ACIDS, POLYUNSATURATED, 18:2undifferentiated, LINOLEIC, OCTADECADIENOIC [numeric] | Mean (sd): 1.8 (4.7); min < med < max: 0 < 0.4 < 74.6; IQR (CV): 1.5 (2.6) | 2079 distinct values | | 561 (9.9%) |
| 69 | FATTY ACIDS, POLYUNSATURATED, 18:3undifferentiated, LINOLENIC, OCTADECATRIENOIC [numeric] | Mean (sd): 0.2 (1.2); min < med < max: 0 < 0.1 < 53.4; IQR (CV): 0.1 (6) | 689 distinct values | | 656 (11.5%) |
| 70 | FATTY ACIDS, POLYUNSATURATED, 20:4, EICOSATETRAENOIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 1.8; IQR (CV): 0 (2.6) | 261 distinct values | | 1210 (21.3%) |
| 71 | FATTY ACIDS, POLYUNSATURATED, 22:6 n-3, DOCOSAHEXAENOIC (DHA) [numeric] | Mean (sd): 0 (0.5); min < med < max: 0 < 0 < 18.2; IQR (CV): 0 (9.6) | 296 distinct values | | 137 (2.4%) |
| 72 | FATTY ACIDS, MONOUNSATURATED, 16:1undifferentiated, HEXADECENOIC [numeric] | Mean (sd): 0.2 (1); min < med < max: 0 < 0 < 18.9; IQR (CV): 0.2 (4) | 770 distinct values | | 836 (14.7%) |
| 73 | FATTY ACIDS, POLYUNSATURATED, 18:4, OCTADECATETRAENOIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 3; IQR (CV): 0 (10.2) | 126 distinct values | | 1781 (31.3%) |
| 74 | FATTY ACIDS, POLYUNSATURATED, 20:5 n-3, EICOSAPENTAENOIC (EPA) [numeric] | Mean (sd): 0 (0.4); min < med < max: 0 < 0 < 13.2; IQR (CV): 0 (9.7) | 259 distinct values | | 1306 (23.0%) |
| 75 | FATTY ACIDS, MONOUNSATURATED, 22:1undifferentiated, DOCOSENOIC [numeric] | Mean (sd): 0 (0.8); min < med < max: 0 < 0 < 41.2; IQR (CV): 0 (17.3) | 199 distinct values | | 1532 (26.9%) |
| 76 | FATTY ACIDS, POLYUNSATURATED, 22:5 n-3, DOCOSAPENTAENOIC (DPA) [numeric] | Mean (sd): 0 (0.2); min < med < max: 0 < 0 < 5.6; IQR (CV): 0 (11.1) | 162 distinct values | | 149 (2.6%) |
| 77 | FATTY ACIDS, MONOUNSATURATED, TOTAL [numeric] | Mean (sd): 3.9 (7.8); min < med < max: 0 < 1.2 < 83.7; IQR (CV): 4.5 (2) | 2880 distinct values | | 314 (5.5%) |
| 78 | FATTY ACIDS, POLYUNSATURATED, TOTAL [numeric] | Mean (sd): 2.2 (5.2); min < med < max: 0 < 0.6 < 74.6; IQR (CV): 1.8 (2.4) | 2381 distinct values | | 316 (5.6%) |
| 79 | NATURALLY OCCURRING FOLATE [numeric] | Mean (sd): 29.2 (75.3); min < med < max: 0 < 9 < 2340; IQR (CV): 20 (2.6) | 261 distinct values | | 503 (8.8%) |
| 80 | RETINOL ACTIVITY EQUIVALENTS [numeric] | Mean (sd): 115 (836); min < med < max: 0 < 3 < 30000; IQR (CV): 33 (7.3) | 463 distinct values | | 260 (4.6%) |
| 81 | DIETARY FOLATE EQUIVALENTS [numeric] | Mean (sd): 44.4 (119); min < med < max: 0 < 12 < 5881; IQR (CV): 41 (2.7) | 332 distinct values | | 499 (8.8%) |
| 82 | FATTY ACIDS, POLYUNSATURATED, 18:2 c,c n-6, LINOLEIC, OCTADECADIENOIC [numeric] | Mean (sd): 2.3 (6); min < med < max: 0 < 0.5 < 74.6; IQR (CV): 1.5 (2.6) | 1285 distinct values | | 3256 (57.2%) |
| 83 | FATTY ACIDS, POLYUNSATURATED, 20:3, EICOSATRIENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 1.4; IQR (CV): 0 (7.1) | 89 distinct values | | 2322 (40.8%) |
| 84 | FATTY ACIDS, POLYUNSATURATED, 18:3 c,c,c n-3 LINOLENIC, OCTADECATRIENOIC [numeric] | Mean (sd): 0.2 (1.3); min < med < max: 0 < 0 < 53.4; IQR (CV): 0.1 (6.2) | 619 distinct values | | 954 (16.8%) |
| 85 | FATTY ACIDS, POLYUNSATURATED, 18:3 c,c,c n-6, g-LINOLENIC, OCTADECATRIENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 1; IQR (CV): 0 (18.1) | 49 distinct values | | 307 (5.4%) |
| 86 | BETA CRYPTOXANTHIN [numeric] | Mean (sd): 15.2 (198); min < med < max: 0 < 0 < 6252; IQR (CV): 0 (13) | 128 distinct values | | 2334 (41.0%) |
| 87 | LYCOPENE [numeric] | Mean (sd): 220 (1807); min < med < max: 0 < 0 < 46260; IQR (CV): 0 (8.2) | 190 distinct values | | 2324 (40.8%) |
| 88 | LUTEIN AND ZEAXANTHIN [numeric] | Mean (sd): 260 (1387); min < med < max: 0 < 0 < 19697; IQR (CV): 39 (5.3) | 419 distinct values | | 2346 (41.2%) |
| 89 | FATTY ACIDS, POLYUNSATURATED, 20:3 n-6, EICOSATRIENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 1.4; IQR (CV): 0 (14.6) | 71 distinct values | | 521 (9.2%) |
| 90 | FATTY ACIDS, POLYUNSATURATED, 20:4 n-6, ARACHIDONIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 1.8; IQR (CV): 0 (2.7) | 227 distinct values | | 2625 (46.1%) |
| 91 | FATTY ACIDS, POLYUNSATURATED, 20:3 n-3 EICOSATRIENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 1; IQR (CV): 0 (16.7) | 52 distinct values | | 505 (8.9%) |
| 92 | VITAMIN B12, ADDED [numeric] | Mean (sd): 1 (17.5); min < med < max: 0 < 0 < 380; IQR (CV): 0 (18.1) | 28 distinct values | | 5218 (91.7%) |
| 93 | ALPHA-TOCOPHEROL, ADDED [numeric] | Mean (sd): 0.1 (0.9); min < med < max: 0 < 0 < 16.9; IQR (CV): 0 (12) | 11 distinct values | | 5231 (91.9%) |
| 94 | VITAMIN D2, ERGOCALCIFEROL [numeric] | Mean (sd): 0.3 (2); min < med < max: 0 < 0 < 28.1; IQR (CV): 0 (6.3) | 22 distinct values | | 5344 (93.9%) |
| 95 | FATTY ACIDS, SATURATED, 4:0, BUTANOIC [numeric] | Mean (sd): 0 (0.2); min < med < max: 0 < 0 < 3.2; IQR (CV): 0 (5) | 274 distinct values | | 1839 (32.3%) |
| 96 | FATTY ACIDS, SATURATED, 6:0, HEXANOIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 2; IQR (CV): 0 (4.9) | 224 distinct values | | 1816 (31.9%) |
| 97 | ALPHA CAROTENE [numeric] | Mean (sd): 40.8 (387); min < med < max: 0 < 0 < 14251; IQR (CV): 0 (9.5) | 164 distinct values | | 2340 (41.1%) |
| 98 | FATTY ACIDS, MONOUNSATURATED, 22:1c, DOCOSENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 1.1; IQR (CV): 0 (5.7) | 100 distinct values | | 2912 (51.2%) |
| 99 | FATTY ACIDS, POLYUNSATURATED, 18:3i, LINOLENIC, OCTADECATRIENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0.3; IQR (CV): 0 (4.7) | 54 distinct values | | 4419 (77.7%) |
| 100 | FATTY ACIDS, MONOUNSATURATED, 22:1t, DOCOSENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0.1; IQR (CV): 0 (18.5) | 16 distinct values | | 2983 (52.4%) |
| 101 | SUCROSE [numeric] | Mean (sd): 2 (7.3); min < med < max: 0 < 0 < 99.8; IQR (CV): 0.4 (3.7) | 487 distinct values | | 3044 (53.5%) |
| 102 | GLUCOSE [numeric] | Mean (sd): 0.8 (2.5); min < med < max: 0 < 0 < 35.8; IQR (CV): 0.5 (3.2) | 399 distinct values | | 3051 (53.6%) |
| 103 | FRUCTOSE [numeric] | Mean (sd): 0.7 (2.5); min < med < max: 0 < 0 < 55.6; IQR (CV): 0.3 (3.6) | 387 distinct values | | 3055 (53.7%) |
| 104 | LACTOSE [numeric] | Mean (sd): 0.3 (1.2); min < med < max: 0 < 0 < 13.2; IQR (CV): 0 (4) | 225 distinct values | | 3076 (54.1%) |
| 105 | MALTOSE [numeric] | Mean (sd): 0.2 (0.8); min < med < max: 0 < 0 < 16.4; IQR (CV): 0 (3.9) | 217 distinct values | | 3098 (54.4%) |
| 106 | GALACTOSE [numeric] | Mean (sd): 0 (0.5); min < med < max: 0 < 0 < 19.9; IQR (CV): 0 (14.1) | 53 distinct values | | 3122 (54.9%) |
| 107 | FATTY ACIDS, SATURATED, 20:0, EICOSANOIC [numeric] | Mean (sd): 0 (0.2); min < med < max: 0 < 0 < 4.6; IQR (CV): 0 (4.6) | 183 distinct values | | 3649 (64.1%) |
| 108 | FATTY ACIDS, SATURATED, 22:0, DOCOSANOIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 3.7; IQR (CV): 0 (5.8) | 133 distinct values | | 3691 (64.9%) |
| 109 | FATTY ACIDS, MONOUNSATURATED, 14:1, TETRADECENOIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 1.8; IQR (CV): 0 (4) | 156 distinct values | | 3666 (64.4%) |
| 110 | FATTY ACIDS, MONOUNSATURATED, 20:1, EICOSENOIC [numeric] | Mean (sd): 0.1 (0.6); min < med < max: 0 < 0 < 15; IQR (CV): 0 (6.3) | 365 distinct values | | 1759 (30.9%) |
| 111 | FATTY ACIDS, SATURATED, 15:0, PENTADECANOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0.9; IQR (CV): 0 (2.9) | 121 distinct values | | 3772 (66.3%) |
| 112 | FATTY ACIDS, SATURATED, 17:0, HEPTADECANOIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 0.8; IQR (CV): 0 (2) | 189 distinct values | | 3723 (65.4%) |
| 113 | FATTY ACIDS, SATURATED, 24:0, TETRACOSANOIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 4.7; IQR (CV): 0 (9.4) | 91 distinct values | | 3949 (69.4%) |
| 114 | STARCH [numeric] | Mean (sd): 4 (11.4); min < med < max: 0 < 0 < 73.3; IQR (CV): 0 (2.9) | 360 distinct values | | 3755 (66.0%) |
| 115 | BETA-TOCOPHEROL [numeric] | Mean (sd): 0.1 (0.5); min < med < max: 0 < 0 < 10.5; IQR (CV): 0.1 (5.1) | 65 distinct values | | 4929 (86.6%) |
| 116 | GAMMA-TOCOPHEROL [numeric] | Mean (sd): 2.3 (5.7); min < med < max: 0 < 0.2 < 65.2; IQR (CV): 1.7 (2.5) | 274 distinct values | | 4922 (86.5%) |
| 117 | DELTA-TOCOPHEROL [numeric] | Mean (sd): 0.4 (1.3); min < med < max: 0 < 0 < 15.4; IQR (CV): 0.2 (3.2) | 148 distinct values | | 4928 (86.6%) |
| 118 | FATTY ACIDS, MONOUNSATURATED, 16:1t, HEXADECENOIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 6.1; IQR (CV): 0 (19.6) | 73 distinct values | | 3959 (69.6%) |
| 119 | FATTY ACIDS, MONOUNSATURATED, 18:1t, OCTADECENOIC [numeric] | Mean (sd): 0.1 (0.7); min < med < max: 0 < 0 < 20.2; IQR (CV): 0.1 (5.6) | 295 distinct values | | 4118 (72.4%) |
| 120 | FATTY ACIDS, POLYUNSATURATED, 18:2i, LINOLEIC, OCTADECADIENOIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 2.3; IQR (CV): 0 (3.6) | 140 distinct values | | 4332 (76.1%) |
| 121 | FATTY ACIDS, MONOUNSATURATED, 24:1c, TETRACOSENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0.6; IQR (CV): 0 (7.9) | 45 distinct values | | 4148 (72.9%) |
| 122 | FATTY ACIDS, MONOUNSATURATED, 16:1c, HEXADECENOIC [numeric] | Mean (sd): 0.1 (0.3); min < med < max: 0 < 0 < 6.9; IQR (CV): 0.1 (2.4) | 396 distinct values | | 3923 (68.9%) |
| 123 | FATTY ACIDS, POLYUNSATURATED, 20:2 c,c EICOSADIENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0.7; IQR (CV): 0 (3.1) | 128 distinct values | | 3854 (67.7%) |
| 124 | FATTY ACIDS, MONOUNSATURATED, 18:1c, OCTADECENOIC [numeric] | Mean (sd): 4.7 (72.2); min < med < max: 0 < 1.1 < 2845; IQR (CV): 3.2 (15.5) | 1066 distinct values | | 4132 (72.6%) |
| 125 | FATTY ACIDS, MONOUNSATURATED, 17:1, HEPTADECENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 1.1; IQR (CV): 0 (2.7) | 135 distinct values | | 3903 (68.6%) |
| 126 | FATTY ACIDS, TOTAL TRANS-MONOENOIC [numeric] | Mean (sd): 0.1 (0.7); min < med < max: 0 < 0 < 20.2; IQR (CV): 0.1 (6.2) | 285 distinct values | | 4249 (74.7%) |
| 127 | FATTY ACIDS, MONOUNSATURATED, 15:1, PENTADECENOIC [numeric] | Mean (sd): 0 (0.2); min < med < max: 0 < 0 < 6; IQR (CV): 0 (28) | 27 distinct values | | 4050 (71.2%) |
| 128 | FATTY ACIDS, POLYUNSATURATED, CONJUGATED, 18:2 cla, LINOLEIC, OCTADECADIENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 1.1; IQR (CV): 0 (4.4) | 90 distinct values | | 4331 (76.1%) |
| 129 | FATTY ACIDS, POLYUNSATURATED, 22:4 n-6, DOCOSATETRAENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0.3; IQR (CV): 0 (3.1) | 66 distinct values | | 4229 (74.3%) |
| 130 | FATTY ACIDS, TOTAL TRANS-POLYENOIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 2.5; IQR (CV): 0 (3.8) | 154 distinct values | | 4330 (76.1%) |
| 131 | CHOLINE, TOTAL [numeric] | Mean (sd): 39.1 (70.6); min < med < max: 0 < 19 < 2403; IQR (CV): 50.5 (1.8) | 910 distinct values | | 2813 (49.4%) |
| 132 | BETAINE [numeric] | Mean (sd): 10.6 (31.5); min < med < max: 0 < 3.9 < 630; IQR (CV): 10.1 (3) | 257 distinct values | | 4589 (80.7%) |
| 133 | FATTY ACIDS, POLYUNSATURATED, TOTAL OMEGA N-3 [numeric] | Mean (sd): 0.5 (2.4); min < med < max: 0 < 0.1 < 53.4; IQR (CV): 0.2 (4.9) | 548 distinct values | | 3717 (65.3%) |
| 134 | FATTY ACIDS, POLYUNSATURATED, TOTAL OMEGA N-6 [numeric] | Mean (sd): 3.1 (23.3); min < med < max: 0 < 0.5 < 953; IQR (CV): 1.4 (7.6) | 1055 distinct values | | 3711 (65.2%) |
| 135 | ASPARTAME [numeric] | Mean (sd): 51.1 (403); min < med < max: 0 < 0 < 3722; IQR (CV): 0 (7.9) | 0: 82 (94.3%); 37: 1 (1.1%); 42: 1 (1.1%); 52: 1 (1.1%); 597: 1 (1.1%); 3722: 1 (1.1%) | | 5603 (98.5%) |
| 136 | TOTAL PLANT STEROL [numeric] | Mean (sd): 26.4 (80.7); min < med < max: 0 < 0 < 1190; IQR (CV): 14 (3.1) | 117 distinct values | | 4995 (87.8%) |
| 137 | MANNITOL [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0.2; IQR (CV): 0 (17.6) | 3 distinct values | | 4313 (75.8%) |
| 138 | SORBITOL [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 2.3; IQR (CV): 0 (14) | 9 distinct values | | 4304 (75.6%) |
| 139 | STIGMASTEROL [numeric] | Mean (sd): 1.3 (5.9); min < med < max: 0 < 0 < 59; IQR (CV): 0 (4.6) | 26 distinct values | | 5183 (91.1%) |
| 140 | TOTAL MONOSACCARIDES [numeric] | Mean (sd): 0.8 (2.7); min < med < max: 0 < 0 < 30.6; IQR (CV): 0.1 (3.3) | 267 distinct values | | 3810 (67.0%) |
| 141 | TOTAL DISACCHARIDES [numeric] | Mean (sd): 1.5 (4.8); min < med < max: 0 < 0 < 47.2; IQR (CV): 0 (3.3) | 295 distinct values | | 3824 (67.2%) |
| 142 | BETA-SITOSTEROL [numeric] | Mean (sd): 14 (47.1); min < med < max: 0 < 0 < 621; IQR (CV): 0 (3.4) | 54 distinct values | | 5187 (91.2%) |
| 143 | HYDROXYPROLINE [numeric] | Mean (sd): 0.1 (0.1); min < med < max: 0 < 0 < 0.7; IQR (CV): 0.2 (1.3) | 197 distinct values | | 5083 (89.3%) |
| 144 | FATTY ACIDS, SATURATED, 13:0 TRIDECANOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0.1; IQR (CV): 0 (10.7) | 10 distinct values | | 5241 (92.1%) |
| 145 | FATTY ACIDS, POLYUNSATURATED, 21:5 [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0.4; IQR (CV): 0 (11.5) | 12 distinct values | | 4734 (83.2%) |
| 146 | FATTY ACIDS, MONOUNSATURATED, 24:1undifferentiated, TETRACOSENOIC [numeric] | Mean (sd): 0 (0.1); min < med < max: 0 < 0 < 3; IQR (CV): 0 (19.8) | 32 distinct values | | 4550 (80.0%) |
| 147 | FATTY ACIDS, MONOUNSATURATED, 12:1, LAUROLEIC [numeric] | 1 distinct value | 0: 351 (100.0%) | | 5339 (93.8%) |
| 148 | FATTY ACIDS, POLYUNSATURATED, 22:3, [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0.1; IQR (CV): 0 (13.6) | 10 distinct values | | 4754 (83.6%) |
| 149 | FATTY ACIDS, POLYUNSATURATED, 22:2, DOCOSADIENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0; IQR (CV): 0 (15.5) | 4 distinct values | | 4690 (82.4%) |
| 150 | FATTY ACIDS, POLYUNSATURATED, 18:2t,t , OCTADECADIENENOIC [numeric] | Mean (sd): 0 (0); min < med < max: 0 < 0 < 0.5; IQR (CV): 0 (5.9) | 59 distinct values | | 4622 (81.2%) |
| 151 | CAMPESTEROL [numeric] | Mean (sd): 3.8 (16); min < med < max: 0 < 0 < 189; IQR (CV): 0 (4.2) | 27 distinct values | | 5400 (94.9%) |
| 152 | BIOTIN [numeric] | Mean (sd): 6.1 (6.5); min < med < max: 0 < 3.5 < 31.6; IQR (CV): 7.2 (1.1) | 71 distinct values | | 5585 (98.2%) |
| 153 | NA [numeric] | 1 distinct value | 1 distinct values | | 5689 (100.0%) |
| 154 | OXALIC ACID [numeric] | Mean (sd): 0.3 (0.4); min < med < max: 0 < 0.1 < 1.7; IQR (CV): 0.3 (1.4) | 27 distinct values | | 5639 (99.1%) |
Analyses were conducted using the R Statistical language (version 3.6.0; R Core Team, 2019) on macOS Mojave 10.14.6, using the packages GGally (version 2.0.0; Barret Schloerke et al., 2020), gtsummary (version 1.3.6; Daniel Sjoberg et al., 2021), summarytools (version 0.9.8; Dominic Comtois, 2020), Matrix (version 1.2.17; Douglas Bates and Martin Maechler, 2019), RColorBrewer (version 1.1.2; Erich Neuwirth, 2014), ggplot2 (version 3.3.3; Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.), tidyverse (version 1.2.1; Hadley Wickham, 2017), stringr (version 1.4.0; Hadley Wickham, 2019), tidyr (version 1.1.2; Hadley Wickham, 2020), forcats (version 0.5.1; Hadley Wickham, 2021), readr (version 1.3.1; Hadley Wickham, Jim Hester and Romain Francois, 2018), dplyr (version 1.0.2; Hadley Wickham et al., 2020), stargazer (version 5.2.2; Hlavac, Marek, 2018), wordcloud (version 2.6; Ian Fellows, 2018), tm (version 0.7.8; Ingo Feinerer and Kurt Hornik, 2020), glmnet (version 4.1.1; Jerome Friedman et al., 2010), car (version 3.0.3; John Fox and Sanford Weisberg, 2019), carData (version 3.0.2; John Fox, Sanford Weisberg and Brad Price, 2018), here (version 1.0.1; Kirill Müller, 2020), tibble (version 3.1.0; Kirill Müller and Hadley Wickham, 2021), NLP (version 0.2.1; Kurt Hornik, 2020), purrr (version 0.3.4; Lionel Henry and Hadley Wickham, 2020), sjPlot (version 2.8.6; Lüdecke D, 2020), report (version 0.3.0; Makowski et al., 2020), data.table (version 1.12.2; Matt Dowle and Arun Srinivasan, 2019), varhandle (version 2.0.5; Mehrad Mahmoudian, 2020), SnowballC (version 0.7.0; Milan Bouchet-Valat, 2020), pacman (version 0.5.1; Rinker et al., 2017), corrplot (version 0.84; Taiyun Wei and Viliam Simko, 2017) and pROC (version 1.17.0.1; Xavier Robin et al., 2011).
data.frame. R package version 1.12.2. https://CRAN.R-project.org/package=data.table
Xavier Robin, Natacha Turck, Alexandre Hainard, Natalia Tiberti, Frédérique Lisacek, Jean-Charles Sanchez and Markus Müller (2011). pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics, 12, p. 77. DOI: 10.1186/1471-2105-12-77 http://www.biomedcentral.com/1471-2105/12/77/
possibly use this to write functions